164 research outputs found

    Random Forests model selection

    Random Forests (RF) of tree classifiers are a popular ensemble method for classification. RF have been shown to be effective in many real-world classification problems and are now considered among the best learning algorithms in this context. In this paper we discuss the effect of the RF hyperparameters on the accuracy of the final model, with particular reference to different theoretically grounded weighting strategies for the trees in the forest. In this way we go against the common misconception that RF is a hyperparameter-free learning algorithm. Results on a series of benchmark datasets show that performing an accurate Model Selection procedure can greatly improve the accuracy of the final RF classifier.

    Tuning the distribution dependent prior in the PAC-Bayes framework based on empirical data

    In this paper we further develop the idea that the PAC-Bayes prior can be defined based on the data-generating distribution. In particular, following Catoni [1], we refine some recent generalisation bounds on the risk of the Gibbs Classifier, when the prior is defined in terms of the data-generating distribution and the posterior is defined in terms of the observed one. Moreover, we show that the prior and the posterior distributions can be tuned based on the observed samples without worsening the convergence rate of the bounds and with only a marginal impact on their constants.

    The Digital Kernel Perceptron

    In this paper, we show that a kernel-based perceptron can be efficiently implemented in digital hardware using very few components. Despite its simplicity, the experimental results on standard data sets show remarkable performance in terms of generalization error.
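As a rough illustration of the kernel perceptron itself (not the paper's hardware design, which the abstract does not detail), the following is a minimal software sketch. The RBF kernel, the `gamma` value, the function names and the XOR toy data are all assumptions chosen for illustration. Note that the dual coefficients stay integer mistake counts, which is one reason this algorithm lends itself to simple digital implementations.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def train_kernel_perceptron(X, y, kernel=rbf_kernel, epochs=20):
    """Return integer mistake counts alpha, one per training point."""
    n = len(X)
    alpha = [0] * n
    # Precompute the Gram matrix once.
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * score <= 0:      # misclassified (or on the boundary)
                alpha[i] += 1          # update is a plain integer increment
                mistakes += 1
        if mistakes == 0:              # a full clean pass: stop early
            break
    return alpha

def predict(X, y, alpha, x, kernel=rbf_kernel):
    """Sign of the kernel expansion evaluated at x."""
    score = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))
    return 1 if score > 0 else -1

# Toy XOR problem, not linearly separable in the input space.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
alpha = train_kernel_perceptron(X, y)
preds = [predict(X, y, alpha, x) for x in X]
```

On this toy set the perceptron separates the four points after one round of updates, so `preds` matches `y`.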

    Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine

    Activity-Based Computing aims to capture the state of the user and their environment by exploiting heterogeneous sensors in order to provide adaptation to exogenous computing resources. When these sensors are attached to the subject’s body, they permit continuous monitoring of numerous physiological signals. This has appealing uses in healthcare applications, e.g. the exploitation of Ambient Intelligence (AmI) in daily activity monitoring for elderly people. In this paper, we present a system for human physical Activity Recognition (AR) using smartphone inertial sensors. As these mobile phones are limited in terms of energy and computing power, we propose a novel hardware-friendly approach for multiclass classification. This method adapts the standard Support Vector Machine (SVM) and exploits fixed-point arithmetic for computational cost reduction. A comparison with the traditional SVM shows a significant improvement in terms of computational costs while maintaining similar accuracy, which can contribute to developing more sustainable systems for AmI. Peer Reviewed. Postprint (author's final draft)
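To make the fixed-point idea concrete, here is a small sketch of a linear multiclass decision function evaluated with integers only. It is not the method from the paper: the one-vs-all scheme, the 8 fractional bits, the function names and the example weights are all assumptions made for illustration.

```python
def to_fixed(value, frac_bits=8):
    """Quantize a real number to an integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))

def fixed_dot(w_fx, x_fx):
    """Integer dot product; the result carries 2 * frac_bits fractional bits."""
    return sum(w * x for w, x in zip(w_fx, x_fx))

def predict_multiclass(weights, biases, x, frac_bits=8):
    """One-vs-all prediction using integer arithmetic only (assumed scheme)."""
    x_fx = [to_fixed(v, frac_bits) for v in x]
    scores = []
    for w, b in zip(weights, biases):
        w_fx = [to_fixed(v, frac_bits) for v in w]
        # The bias is scaled to match the product's 2 * frac_bits format.
        b_fx = to_fixed(b, 2 * frac_bits)
        scores.append(fixed_dot(w_fx, x_fx) + b_fx)
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical three-class linear model and one feature vector.
weights = [[1.0, -0.5], [-1.0, 0.25], [0.1, 0.9]]
biases = [0.0, 0.1, -0.2]
x = [0.5, 0.75]

fixed_label = predict_multiclass(weights, biases, x)

# Floating-point reference for comparison.
float_scores = [sum(wi * xi for wi, xi in zip(w, x)) + b
                for w, b in zip(weights, biases)]
float_label = max(range(len(float_scores)), key=float_scores.__getitem__)
```

With these values the integer and floating-point predictions agree. In general the agreement depends on how many fractional bits are kept relative to the margins between class scores.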

    Spectral Analysis of Electricity Demand Using Hilbert–Huang Transform

    The large number of sensors in modern electrical networks poses a serious challenge on the data processing side. For many years, spectral analysis has been one of the most widely used approaches to extract physically meaningful information from a sea of data. Fourier Transform (FT) and Wavelet Transform (WT) are by far the most employed tools in this analysis. In this paper we explore the alternative use of Hilbert–Huang Transform (HHT) for electricity demand spectral representation. A sequence of hourly consumptions, spanning 40 months of electrical demand in Spain, has been used as dataset. First, by Empirical Mode Decomposition (EMD), the sequence has been time-represented as an ensemble of 13 Intrinsic Mode Functions (IMFs). Then, by applying Hilbert Transform (HT) to every IMF, an HHT spectrum has been obtained. Results show smoother spectra with more defined shapes and an excellent frequency resolution. EMD also fosters a deeper analysis of abnormal electricity demand at different timescales. Additionally, EMD permits information compression, which becomes very significant for lossless sequence representation. A 35% reduction has been obtained for the electricity demand sequence. On the negative side, HHT demands more computer resources than conventional spectral analysis techniques.

    Unintrusive Monitoring of Induction Motors Bearings via Deep Learning on Stator Currents

    Induction motors are fundamental components of several modern automation systems, and they are one of the central pivots of the developing e-mobility era. The most vulnerable parts of an induction motor are the bearings, the stator winding and the rotor bars. Consequently, monitoring and maintaining them during operation is vital. In this work, the authors propose an induction motor bearing monitoring tool which leverages stator current signals processed with a Deep Learning architecture. Unlike state-of-the-art approaches, which exploit vibration signals collected by easily damaged and intrusive vibration probes, the stator current signals are already commonly available, or can be collected easily and unintrusively. Moreover, instead of using now-classical data-driven models, the authors exploit a Deep Learning architecture able to extract from the stator current signal a compact and expressive representation of the bearing state, ultimately providing a bearing fault detection system. In order to estimate the effectiveness of the proposal, the authors collected a series of data from an inverter-fed motor mounting different artificially damaged bearings. Results show that the proposed approach provides a promising and effective yet simple bearing fault detection system.

    Support vector machines for interval discriminant analysis

    The use of data represented by intervals can be caused by imprecision in the input information, incompleteness in patterns, discretization procedures, prior knowledge insertion or speed-up learning. All the existing support vector machine (SVM) approaches working on interval data use local kernels based on a certain distance between intervals, either by combining the interval distance with a kernel or by explicitly defining an interval kernel. This article introduces a new procedure for the linearly separable case, derived from convex optimization theory, inserting information directly into the standard SVM in the form of intervals, without taking any particular distance into consideration. Ministerio de Educación y Ciencia DPI2006-15630- C02-0

    Measuring the expressivity of graph kernels through the Rademacher complexity

    Graph kernels are widely adopted in real-world applications that involve learning on graph data. Different graph kernels have been proposed in the literature, but no theoretical comparison among them is available. In this paper we provide a formal definition for the expressiveness of a graph kernel by means of the Rademacher Complexity, and analyze the differences among some state-of-the-art graph kernels. Results on real-world datasets confirm some known properties of graph kernels, showing that the Rademacher Complexity is indeed a suitable measure for this analysis.
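For readers unfamiliar with the quantity, the sketch below estimates the empirical Rademacher complexity of the unit-norm function class induced by a kernel, using only its Gram matrix. This is a standard textbook formulation and may differ from the exact definition used in the paper. For that class the complexity equals (1/n) E[sqrt(s' K s)] over random sign vectors s, and Jensen's inequality bounds it by sqrt(trace(K)) / n. The function names and the toy Gram matrices are assumptions; for a graph kernel, `K` would hold the pairwise kernel values between graphs.

```python
import math
import random

def empirical_rademacher(K, n_draws=2000, seed=0):
    """Monte Carlo estimate of (1/n) * E_sigma[ sqrt(sigma^T K sigma) ]."""
    rng = random.Random(seed)
    n = len(K)
    total = 0.0
    for _ in range(n_draws):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        quad = sum(sigma[i] * K[i][j] * sigma[j]
                   for i in range(n) for j in range(n))
        total += math.sqrt(max(quad, 0.0))  # guard against round-off
    return total / (n_draws * n)

def trace_bound(K):
    """Jensen upper bound: sqrt(trace(K)) / n."""
    n = len(K)
    return math.sqrt(sum(K[i][i] for i in range(n))) / n

n = 4
# Identity Gram matrix: every point is orthogonal to the others.
K_identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# All-ones Gram matrix: the kernel cannot tell the points apart.
K_ones = [[1.0] * n for _ in range(n)]

est_identity = empirical_rademacher(K_identity)
est_ones = empirical_rademacher(K_ones)
bound = trace_bound(K_identity)  # same trace for both matrices
```

The kernel that distinguishes all points attains the bound, while the kernel that collapses them yields a smaller value. This matches the intuition that a more expressive kernel can fit random labels better.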